This is my PM566 HW5 website.


Plot of number of new cases vs date outside of China

# remove rows that have no case number recorded
# (logical indexing also works when there are no missing values)
newcase_data=new[!is.na(new$case_in_country),c(2,3,5)]
dim(newcase_data)
## [1] 887   3
# count the number of new cases on each reporting date
a=as.data.frame(table(newcase_data$`reporting date`))

a %>% 
  plot_ly(x = ~Var1, y = ~Freq, type = "scatter", mode = "lines") %>%
  layout(xaxis = list(title="Date"), yaxis = list(title="Number of cases"), title = "Number of new cases outside of China")

The plot shows that, during the first two months of the COVID-19 pandemic, the number of new cases reported outside of China increased over time. This suggests that the virus had already spread to other parts of the world.
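If the reporting dates are stored as text, the counts from `table()` are ordered alphabetically rather than chronologically. The following is a minimal sketch of counting cases per day on parsed dates. The toy data and the month/day/year format are assumptions made for illustration and are not taken from the data set used here.

```r
# Toy data standing in for newcase_data; values and date format are assumptions.
toy <- data.frame(
  `reporting date` = c("1/20/2020", "2/1/2020", "1/20/2020",
                       "1/25/2020", "2/1/2020", "2/1/2020"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Parse the strings as dates so that sorting is chronological.
toy$date <- as.Date(toy$`reporting date`, format = "%m/%d/%Y")

# Count reports per day, then order the result by date.
daily <- as.data.frame(table(date = toy$date), stringsAsFactors = FALSE)
daily$date <- as.Date(daily$date)
daily <- daily[order(daily$date), ]
print(daily)
```

A data frame like `daily` could be passed to `plot_ly()` in place of `a`, with `x = ~date` and `y = ~Freq`.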

Evaluate the length of the incubation period.

new %>% plot_ly(y = ~exposure_start, type = "box", name = "Exposure Start Date") %>% 
  add_trace(y = ~symptom_onset, type = "box", name = "Symptom Onset Date")

The median exposure start date and the median symptom onset date are about 14 days apart, which is consistent with the commonly cited incubation period of up to 14 days for COVID-19.
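Comparing the two medians gives only a rough estimate, because the difference between two medians is not the same as the median of the per-case differences. A minimal sketch of the per-case calculation is shown here. The four cases and the month/day/year format are hypothetical, and only the column names `exposure_start` and `symptom_onset` come from the code above.

```r
# Hypothetical cases; the dates and their format are assumptions for illustration.
toy <- data.frame(
  exposure_start = c("1/10/2020", "1/12/2020", "1/15/2020", NA),
  symptom_onset  = c("1/15/2020", "1/20/2020", "1/19/2020", "1/22/2020"),
  stringsAsFactors = FALSE
)

exposure <- as.Date(toy$exposure_start, format = "%m/%d/%Y")
onset    <- as.Date(toy$symptom_onset,  format = "%m/%d/%Y")

# Days from exposure start to symptom onset, one value per case.
incubation_days <- as.numeric(onset - exposure)

# Summarise only the cases where both dates are known.
median_incubation <- median(incubation_days, na.rm = TRUE)
print(median_incubation)
```

Cases with a missing date produce `NA` and are dropped by `na.rm = TRUE`, so the summary uses only cases where both dates were recorded.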


Plot of the COVID-19 cases outside of China

b=as.data.frame(table(newcase_data$country))
b %>%
  group_by(Var1) %>%
  plot_ly(labels = ~ Var1, values = ~Freq, textposition = "inside") %>%
  add_pie(hole = 0.6) %>%
  layout(title = "Donut chart of total number of cases outside of China")

The chart shows that the countries with the largest numbers of cases are mostly located near China. This is consistent with China being where the COVID-19 virus originated.
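The ranking behind this reading can also be checked numerically. The sketch below sorts the country counts and adds each country's share of the total. The country labels are hypothetical stand-ins for `newcase_data$country`, not values from the data set.

```r
# Hypothetical country labels standing in for newcase_data$country.
country <- c("Japan", "Singapore", "Japan", "South Korea", "Japan", "Singapore")

# Count cases per country and compute each country's share of the total.
counts <- as.data.frame(table(country = country), stringsAsFactors = FALSE)
counts$share <- round(100 * counts$Freq / sum(counts$Freq), 1)

# Largest counts first.
counts <- counts[order(-counts$Freq), ]
print(counts)
```

The `share` column corresponds to the percentages a pie or donut chart displays for each slice.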